Genetic characterization of a novel pheasant-origin orthoreovirus using Next-Generation Sequencing

A field isolate (Reo/SDWF /Pheasant/17608/20) of avian orthoreovirus (ARV), isolated from a flock of game-pheasants in Weifang, Shandong Province, was genetically characterized being a field variant or novel strain in our recent research studies in conducting whole genome sequencing by using Next-Generation Sequencing (NGS) technique on Illumina MiSeq platform. Among a total of 870,197 35-151-mer sequencing reads, 297,711 reads (34.21%) were identified as ARV sequences. The de novo assembly of the ARV reads resulted in generation of 10 ARV-related contigs with the average sequencing coverage from 1390× to 1977× according to 10 ARV genome segments. The complete genomes of this pheasant-origin ARV (Reo/SDWF /Pheasant/17608/20) were 23,495 bp in length and consist of 10 dsRNA segments ranged from 1192 bp (S4) to 3958 bp (L1) encoding 12 viral proteins. Sequence comparison between the SDWF17608 and classic ARV reference strains revealed that 58.1–100% nucleotide (nt) identities and 51.4–100% amino acid (aa) identities were in genome segment coding genes. The 10 RNA segments had conversed termini at 5’ (5’-GCUUUU) and 3’ (UCAUC-3’) side, which were identical to the most published ARV strains. Phylogenetic analysis revealed that this pheasant ARV field variant was closely related with chicken ARV strains in 7 genome segment genes, but it possessed significant sequence divergence in M1, M3 and S2 segments. These findings suggested that this pheasant-origin field variant was a divergent ARV strain and was likely originated from reassortments between different chicken ARV strains.


Introduction
Viruses of the family Reoviridae have segmented dsRNA genomes and are classified into two subfamilies comprising a total of 15 genera [1]. The  (NBV), and Avian orthoreovirus (ARV) [2,3]. ARV is non-enveloped virus with icosahedral symmetry and contains a surface protein arranged in a double shell [4]. Virus particles have an average size of 70-80nm and are wrapped around 10 genomic fragments. Depending on their movement in electrophoresis, linear genomic fragments were molecularly divided into three different groups, including three large fragments (L1-L3), three medium fragments (M1-M3) and four small fragments (S1-S4) [5,6]. Besides the tricistronic S1 segment, all ARV genome segments are monocistronic and all genome segments encode 8 structural proteins (λA, λB, λC, μA, μB, σA, σB and σC) and 4 nonstructural proteins (μNS, p10, p17 and σNS) [7]. Each ARV coding gene was flanked by 5' and 3' non-translated regions and the first seven bases (5'-GCUUUUU) and the last five bases (UCAUC-3') of segment termini were found are highly conserved in known ARV strains [8]. ARV is a highly contagious virus involved in a variety of disease conditions or syndromes in poultry, of which viral arthritis/tenosynovitis is the most classic leg lameness or weakness seen in ARV-affected young broiler chickens [3,9]. Since the 1980s, with the rapid development of the modern poultry industry, new symptoms or newly observed disease problems associated with ARV infections have been continuously reported, such as runting-stunting syndrome (RSS) [10], immunosuppression [11], hepatitis [12], malabsorption/maldigestion syndrome [13], respiratory disease [14], and enteric disease [15]. In recent years in Pennsylvania, commercial poultry flocks suffered viral arthritis/tenosynovitis have been increasingly diagnosed and field variant strains of the newly emerging ARVs were confirmed as the causative agent [16,17]. During the same period, highly pathogenic ARV variants emerged in turkey in Midwestern United States [18,19], as well as in other countries [18][19][20]. Similarly, an increasing number of cases of arthritis in broilers caused by ARV has occurred in China, and the ARV field variants remain higher genetic diversity and virulence in flocks, which caused considerably economic losses in the poultry industry. Most of the emerged variants showed common features of genome segments reassortments with historical ARV strains and high genetic diversity in σC genes [21].
Many research studies on ARV infections in various avian species, especially domestic poultry, have been well documented, such as broiler breeders [22], layer breeders [23], broilers [24], geese [25,26], turkeys [27,28], ducks [29][30][31], pigeons [32], and quails [33][34][35]. Although ARV transmissions commonly occur within and between avian species, the importance of wild birds as reservoirs of ARV transmission source to domestic poultry infections was not well studied until recent years due to the difficulty of wild bird sampling [36,37]. Until now, there were only two pheasant cases of ARV infections were reported in 1990s, one case ARV strain was associated with hepatopathy symptoms and the other was associated with tenosynovitis [38], however, there is no previous report of pheasant being infected by ARV in China. In the present study, we report our findings of isolation and full-genome characterization of a novel pheasant-origin ARV field variant strain.

Virus isolate
The pheasant ARV isolate (Reo/SDWF /Pheasant/17608/20) in this study was isolated from tendon tissue of a pheasant case with hepatopathy symptoms at 2-4 weeks of age. The ARV isolation was made in LMH cell cultures and produced giant or bloom-like cytopathic effect (CPE) cells, which were characteristic to ARV and confirmed positive for ARV by fluorescent antibody (FA) test using ARV conjugated antibody (ID No. 680 VDL 9501, NVSL, Ames, IA, USA) which described in our previous study [39]. The ARV cell culture material was subsequently tested negative for other avian viruses which could cause the pheasant leg lesions such as avian influenza virus, Newcastle disease virus, fowl adenovirus type 1 and rotavirus. This pheasant ARV was propagated in LMH cell cultures, tittered as 10 8.5 TCID 50 /mL, aliquoted and stored at -80˚C freezer for this study.

RT-PCR and Sanger sequencing
Viral RNA extraction of the pheasant ARV was performed by using MiniBEST Universal RNA Extraction Kit (TaKaRa, Dalian, China) per the manufacturer's instructions. Conventional RT-PCR reaction was carried out with P1/P4 primers which corresponding to 3' end of S1 segment (σC gene) of ARV [39] by using the One Step RT-PCR Kit Ver.2 (TaKaRa, Dalian, China). RT-PCR products were observed by 1% agarose gel electrophoresis and the 1088bp band was purified by using gel extraction kit (OMEGA, D2500-02 Gel Extraction Kit, USA) following the manufacturer's instructions. The concentration of the purified DNA was confirmed by using a NanoDrop™1000 (DeNovix DS-11, USA) spectrophotometer and then submitted to Personalbio for Sanger sequencing.

Next-generation sequencing
NGS was carried out on a Miseq platform. Total RNA samples were processed by MiniBEST Universal RNA Extraction Kit (TaKaRa, Dalian, China) to build cDNA library. Briefly, the total RNA was fragmented into small pieces using magnesium divalent cations under elevated temperature [40]. The cleaved RNA fragments were copied into the first strand cDNA using reverse transcriptase (Invitrogen, Grand Island, NY, USA) and random primers. The second strand cDNA synthesis was followed using DNA polymerase I and RNase H but without the initial poly A enrichment step. Then these cDNA fragments were assessed by Direct Detect™ bioanalyzer system (DeNovix DS-11, USA) to test the fragments distribution. Thereafter, the prepared cDNA library was loading on the Miseq sequencer to get the raw NGS reads.

De novo assembly of viral genome
The CLC Genomics Workbench V7.5.2 software (QIAGEN, Boston, MA, USA) was used for NGS raw data De novo assembling. Firstly, sequencing adaptors were trimmed off and contaminants sequences mapped to rRNA and mRNA reads were removed. The rest of clean reads were processed by software to get contigs. All ARV-related contigs were selected to build the full-length genome of ARV based on the BLASTN searching result. Each assembled segment was further upgraded by mapping back all NGS raw reads to the ARV-related contigs, and the consensus sequences were considered as the complete ARV genome.

Sequence analyses
The modules of EditSeq and MegAlign of DNASTAR Lasergene 12 Core Suite (DNASTAR, Inc. Madison, WI, USA) were used for viral open reading frames (ORFs) prediction, amino acid (aa) translation, sequence alignment, and pair-wised sequence comparison. An online search program (http://blast.ncbi.nlm.nih.gov/Blast.cgi) identified the highest similarities between the studied ARV genome segment and the published sequences.
Sequencing coverage, mapped reads, and intra-host single-nucleotide variants (iSNVs) of each assembled contigs were calculated and visualized by CLC Genomic Workbench V7.5 software (QIAGEN, Boston, MA, USA). Phylogenetic analysis of genome segments were carried out by using the neighbor-joining method in MEGA CC program [41] and the bootstrap validation method with 1000 replications. The visualization of genome alignment was performed using the mVISTA (http://genome.lbl.gov/vista/mvista/submit.shtml) and the scale sequence was using studied pheasant-origin ARV genome. Genome sequences of 13 ARV reference strains were retrieved from Genbank for sequence comparisons (S1 Table).

Sanger sequencing
One-step RT-PCR was used to test the pheasant ARV viral RNA using specific primers (P1/ P4) based on the S1 gene. As a result, the 1088bp PCR product was successfully amplified, and sanger sequencing results of showed the pheasant ARV strain (MZ561700) has about 88% nucleotides homology to those of other novel GoAstV strains, ARV strain in GenBank (L07069). In the next-generation sequencing analysis, the contigs of ARV were clearly mapped, whereas draft contigs to other-related viruses were not mapped.

Analyzing the NGS raw data
A total of 842,235 sequencing reads of 35-151-mer were generated on Miseq sequencer from extracted viral total RNA. The final NGS sequence data from the viral stocks output file in fastq format was 79.3Mb in size. Low-quality reads, trim poly-T tails and adapter sequences were processed by quality control (QC) filters of the Miseq platform for removal, A further screening of the NGS reads was carried out to remove the non-research target readings that were similar to mRNA or rRNA sequences. As a result, 420,914 reads (48.37%) were identified to be the chicken rRNA source and 109,731 reads (12.61%) to be the chicken mRNA source ( Fig 1A). The residual 339,525 clean reads were further analyzed using a BLASTN procedure, which further divided into no hits group (41,814 reads, 4.81%) and orthoreovirus group (297,711 reads, 34.21%) ( Fig 1A).

De novo assembly of viral genomes
After the assembly of no hits reads and orthoreovirus by de novo assembler of CLC Genomics Workbench software, a total of 72 contigs were generated with length from 150nt to 3958nt. The total reads counts of assembled contigs were found from 2 to 102,404 and the average coverage were from 1.08× to 6,506.34×. By searching online using BLASTN, 10 of the 72 contigs (Table 1) were identified as ARV-related contigs, ranging in length from 1192nt to 3958nt. The highest similarity search of 10 ARV contigs in Genbank revealed that all contigs had different homology (88.0%-96.6%) to other published reference ARV strains ( Table 1). The initial alignment of assembled contigs and most homology reference ARV segments indicated the full-length of 10 ARV segments including the 5' and 3' termini were individually targeted by 10 ARV contigs.

Sequencing coverage
By mapping back all NGS raw reads to the 10 assembled ARV contigs, the mapped reads and sequencing coverage of different genome segments were finally obtained. Although the mapped reads of each segment were various from 14,190 to 60,810, they were still positively correlated with the contig length (Table 1, Fig 1B). Sequencing coverage for each genome fragment averaged from 1390× to 1977×, indicating that there is sufficient redundancy to identify the PA136491 pheasant ARV genome. The sequencing coverage of each ARV fragment is illustrated by the wave-chart (Table 1, Fig 1C), with a peak (2937×) appearing in the L3 fragment.
Sequencing coverage and intra-host single nucleotide variants (iSNVs) were further analyzed using the resequencing analysis module of the CLC Genomic Workbench software. As a result, regions of high sequencing coverage were found throughout the ARV genome ( Fig 1C) by setting above 10 bp as the date aggregation value and a total of 7 iSNVs were determined in L1 and M2 segments (Table 1) by setting the sequencing error correction value of 0.4%.

General genome information
Finally, the complete sequences of 10 fragments of the pheasant antiretroviral genome were obtained and deposited into Genbank (MZ561694 to MZ561703). The SDWF17608 ARV genome fragment ranges from 1192bp (S4) to 3958bp (L1), with a total size of 23,495bp. The GC content displays variation between different genome segments, from 47% to 52%. There were 12 viral proteins encoded by one tricistronic segment (S1) and nine monocistronic segments. ORFs range in length from 3882 bp (λA) to 300 bp (p10), which were similar to the general characteristics of published ARV strains. Although the size of most proteins was identical between this pheasant ARV and the reference ARV strains, but the pheasant ARV nonstructural protein p10 encoding gene on S1 segment showed heterogeneity.

Sequence comparisons
In homologs comparison of the nucleotide (nt) and aa sequences, we found that the pheasant ARV had various similarities ranging from 42.5% nt to 99.6% nt and 29.7% aa to 99.7% aa (Table 3), which were obtained in pairwise comparisons with six ARV reference strains and seven non-ARV orthoreovirus strains (S1 Table). In addition, the comparison results of σC encoding genes revealed that was the most divergent gene between the SDWF17608 and other reference ARV strains by showing the extremely low nt and aa identities (nt: 42.5-64.1%; aa: The alignment of p10 (Fig 2A) and p17 (Fig 2B) proteins revealed that only p10 protein had a highly conserved region (aa 16 to 50) among the SDWF17608 pheasant ARV and other reference ARV strains.

Phylogenetic analysis
By the rooted maximum likelihood phylogenetic analysis, the phylogenetic trees of the evolutionary relationships between the pheasant ARV of SDWF17608 and other orthoreovirus members were confirmed. Fig 3 is the phylogenetic trees generated by aligning their nt sequences of the three L (L1-L3) and three M (M1-M3) genome segments and four σ-class genes. For L-class segments analysis, five host-associated groups were formed by the SDWF17608 strain and other reference strains in all three L segments (L1-L3).
All ARV strains were divided into three subclades, chicken ARV, a turkey ARV, and a waterfowl ARV, while all non-ARV strains from mammalian species were more distantly related to ARV strains in a separate branch The SDWF17608 strain was closely related to some historical ARV strains of chicken group I in L1 segment (Fig 3, L1), but clustered with pathogenic ARV strains of chicken group II in L2 and L3 segments (Fig 3, L2 and L3). whereas DAstV-1 formed another clade with pathogenic ARV strains in in L2 and L3 segments.
For M-class segments analysis, host-associated groups were also observed in M1 and M3 trees, but the SDWF17608 strain evolved distant from the grouped ARV strains and fell into the divergent strains group (Fig 3, M1 and M3). The M2 segment of the SDWF17608 strain classified as a lineage 2 member, together with the PA05682 ARV variant and the classic ARV 138 (Fig 3, M2).
The phylogenetic tree analysis based on σ-class gene indicated the variety relationships between SDWF17608 strain and the reference strain. Overall, the ARV and non-ARV orthoreovirus strains were clearly divided into two main branches, and antiretroviral strains can be further divided into distinct subgroups or clusters. (Fig 3).
For σB and σNS genes, SDWF17608 strain evolved distantly from all vaccine strains or classic strains and grouped with the newly emerged pathogenic ARV variants (Fig 3, σB and σNS). In contrast, the σA gene of SDWF17608 strain was not grouped with any classic or pathogenic ARVs, but evolved as a divergent variant strain (Fig 3, σA). As shown in Fig 3, the σC was the most diverse gene among all 10 ARV segment genes.
Phylogenetic analysis of σC genes among all ARV strains resulted in generation of five genotyping clusters, in which the SDWF17608 strain was grouped into genotyping cluster 5 similar to German and Israeli ARV strains [42].

Whole genome alignment
Whole-genome analysis of pheasant ARV and reference ARV strains was visualized by using the mVISTA online program (Fig 4). Eight of the 10 genome segments of the pheasant ARV and ARV 138 strains were broadly genetically related, with exclusion for the M3 segment and most 3' regions of the S1 segment which corresponding to σC gene. Nevertheless, the M3 and S1 fragments of SDWF17608 strain showed the highest identity to the ARV S1133 and PA05682 strains, respectively. Except for the high similarity in the M2 segment between PA22342 turkey ARV and SDWF17608, most of the other regions are of moderate sequence identity. The duck-origin J18 ARV shared low sequence identities with SDWF17608 strain throughout their genomes, and an even lower identity was observed in S1 segment (<50%), which indicated no segment reassortment between the duck-origin ARV and the pheasantsorigin ARV.

Discussion
ARV as a causative agent of the highly pathogenic and contagious viral arthritis disease of chicken, and ARV-related diseases continue to emerge and expand to other domestic and avian species. The newly emerging ARV field variants mainly affect broiler, layer and turkey productions in the past decade [16,17,43,44].
Numerous studies of the newly emerged ARV field variants suggest that a variety of ARV pathotypes or genotypes were circulating among different flocks and poultry spices [23,[45][46][47].
Although pheasant production is a relatively small-scale poultry business such as the game birds, pheasants can serve as a potential host repository for ARV transmissions and/or ARV genome reassortments to other avian species. This study provides data for the full-genome characterization of pheasant ARV, a novel pheasant ARV variant. By using the method of pairwise sequence comparison, we confirmed that all 10 genomic segments of SDWF17608 strain were nearly identical to the homologous segments of the reference ARVs.
The conserved regions at the 5' and 3' ends of the positive strand are the same as the ARV members, but there are some differences at the 5' end with the non-ARV reference strains, indicating that the genome segments of SDWF17608 strain were all classified into ARV species and did not undergo fragment reassortment with non-ARV strains [48].
The comparison of all coding genes showed that SDWF17608 strain was most closely related to the chicken-origin ARV strains (PA05682 and 138), and the sequence similarity of μB gene with PA05682 broiler antiretroviral strain was the highest (nt: 99.6%; aa: 99.7%). The above results indicated that the PA05682 strain was involved in the ARV genome rearrangement to generate the SDWF17608 strain. As the major outer capsid protein, μB is responsible for virus entry/un-coating and transcriptase activation [49]. The specific sequence of μB gene may be required for an efficient establishment of a successful ARV infection in a specific avian host. The comparison results showed that the SDWF17608 strain was highly identical to the PA05682 broiler antiretroviral μB gene, suggesting that although the SDWF17608 strain was derived from pheasants, it may have the ability to infect other birds. In contrast, σC was the most divergent gene with low sequence identity to the compared strains (nt: <64.1%; aa: <60.2%), which was mainly because the σC gene was the most variable protein in the ARV [50] and showed the less interaction with other viral proteins [42]. In addition, the σB gene encodes the viral in vitro capsid protein and is generally of low homology among different ARV strains. Interestingly, the comparison of σB gene of the SDWF17608 strain in the present study showed host-related identity values, suggesting that the σB gene could be a candidate for a genetic marker to distinguish ARVs from various avian hosts. The SDWF17608 strain expressed a high degree of diversity with classical ARV strains in its M1, M3 and S2 segments. The SDWF17608 strain was most closely related to PA05682 broiler ARV, and 138 strains were further confirmed by phylogenetic analysis in sequence comparison.
NGS is an important tool for assessing genetic diversity, providing high sequencing coverage for population samples such as viruses [51]. In the present study, the sequencing coverage of SDWF17608 strain was observed from 1390× to 1977× on average for different genomic segments. Such high sequencing coverage allowed us to calculate the reliable numbers of iSNVs in viral genome [52,53].
In this study, the iSNV information provided us a number of heterogeneous nucleotide positions of the SDWF17608 strain. Six of the 7 iSNV positions were found in the M2 segments of the μB gene, likely due to the location of the μB protein in the outer capsid leading to more immune selection for the μB gene. However, no heterogeneous sites were observed in the σC and σB genes, suggesting a lower level of spontaneous mutation in these two outer capsid proteins in the pheasant ARV strain (SDWF17608).
In conclusion, we have completed the whole-genome sequencing characterizations for this novel pheasant ARV (Reo/SDWF/pheasant/17608/20) through NGS technology on the Illumina Miseq platform. These findings provide scientific genomic data to better understand the genomic evolutionary relationships between different avian species-origin ARVs.
Supporting information S1 Table